Overview

Dataset statistics

Number of variables: 12
Number of observations: 8523
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 799.2 KiB
Average record size in memory: 96.0 B
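
The size figures above are consistent with one another. The following is a minimal sketch of the arithmetic, assuming 1 KiB = 1024 bytes and that the average record size is the total memory divided by the number of observations. The variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics above.
n_variables = 12
n_observations = 8523
missing_cells = 0
total_size_kib = 799.2

# Total number of cells in the table (rows x columns).
n_cells = n_variables * n_observations

# Share of missing cells, as a percentage.
missing_pct = 100.0 * missing_cells / n_cells

# Average bytes per row, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(n_cells)
print(f"{missing_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
```

An average of about 96 bytes per row matches 12 columns stored at 8 bytes each, which is what one would expect if every column is held as a 64-bit number.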

Variable types

Numeric: 8
Categorical: 4

Warnings

Item_MRP is highly correlated with Item_Outlet_Sales (High correlation)
Outlet_Identifier is highly correlated with Outlet_Size and 1 other field (High correlation)
Outlet_Size is highly correlated with Outlet_Identifier and 1 other field (High correlation)
Outlet_Location_Type is highly correlated with Outlet_Identifier and 1 other field (High correlation)
Item_Outlet_Sales is highly correlated with Item_MRP (High correlation)
Outlet_Identifier is highly correlated with Outlet_Location_Type (High correlation)
Outlet_Size is highly correlated with Outlet_Location_Type (High correlation)
Item_Weight is highly correlated with Outlet_Identifier and 1 other field (High correlation)
Outlet_Location_Type is highly correlated with Outlet_Identifier and 3 other fields (High correlation)
Outlet_Identifier is highly correlated with Item_Weight and 5 other fields (High correlation)
Item_Identifier is highly correlated with Item_Type (High correlation)
Outlet_Type is highly correlated with Item_Weight and 3 other fields (High correlation)
Item_Type is highly correlated with Item_Identifier (High correlation)
Outlet_Size is highly correlated with Outlet_Location_Type and 2 other fields (High correlation)
Outlet_Establishment_Year is highly correlated with Outlet_Location_Type and 3 other fields (High correlation)
Item_Outlet_Sales is highly correlated with Item_MRP and 1 other field (High correlation)
Outlet_Type is highly correlated with Outlet_Location_Type (High correlation)
Outlet_Location_Type is highly correlated with Outlet_Type (High correlation)
Item_Visibility has 526 (6.2%) zeros (Zeros)
Item_Type has 648 (7.6%) zeros (Zeros)
Outlet_Identifier has 555 (6.5%) zeros (Zeros)
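
The zero percentages in the last three warnings follow from the zero counts and the 8523 observations. This is a small sketch of that check; the helper name `zero_share` is hypothetical, and rounding to one decimal place is an assumption about how the figures were formatted.

```python
n_observations = 8523

# Zero counts as reported in the warnings above.
zero_counts = {
    "Item_Visibility": 526,
    "Item_Type": 648,
    "Outlet_Identifier": 555,
}


def zero_share(count, total):
    """Percentage of zero entries, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


shares = {name: zero_share(count, n_observations) for name, count in zero_counts.items()}
for name, share in shares.items():
    print(f"{name}: {share}%")
```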

Reproduction

Analysis started: 2021-07-31 05:37:43.869381
Analysis finished: 2021-07-31 05:38:23.882976
Duration: 40.01 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
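
The reported duration is the difference between the two timestamps. A minimal sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2021-07-31 05:37:43.869381")
finished = datetime.fromisoformat("2021-07-31 05:38:23.882976")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```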

Variables

Item_Identifier
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1559
Distinct (%): 18.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 779.7148891
Minimum: 0
Maximum: 1558
Zeros: 6
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 66.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 77.1
Q1: 395.5
median: 783
Q3: 1167
95-th percentile: 1477
Maximum: 1558
Range: 1558
Interquartile range (IQR): 771.5

Descriptive statistics

Standard deviation: 449.2223766
Coefficient of variation (CV): 0.5761367172
Kurtosis: -1.195555354
Mean: 779.7148891
Median Absolute Deviation (MAD): 386
Skewness: -0.008877177849
Sum: 6645510
Variance: 201800.7436
Monotonicity: Not monotonic
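
Several of the Item_Identifier figures above are derived from others. As a hedged sketch, the snippet below recomputes the range, IQR, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The definitions used (CV as standard deviation over mean, variance as the squared standard deviation) are the usual ones and are assumed here rather than taken from the report.

```python
# Values copied from the Item_Identifier statistics above.
minimum, maximum = 0, 1558
q1, q3 = 395.5, 1167
mean = 779.7148891
std = 449.2223766

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr)
print(f"{cv:.6f}")
print(f"{variance:.2f}")
```

The same relationships can be used to sanity-check the statistics of the other numeric variables in this report.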
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
413                   10      0.1%
1077                  10      0.1%
702                   9       0.1%
390                   9       0.1%
1454                  9       0.1%
1542                  9       0.1%
750                   9       0.1%
1276                  9       0.1%
35                    9       0.1%
301                   9       0.1%
Other values (1549)   8431    98.9%

Smallest values

Value                 Count   Frequency (%)
0                     6       0.1%
1                     7       0.1%
2                     8       0.1%
3                     3       < 0.1%
4                     5       0.1%
5                     4       < 0.1%
6                     6       0.1%
7                     7       0.1%
8                     6       0.1%
9                     4       < 0.1%

Largest values

Value                 Count   Frequency (%)
1558                  7       0.1%
1557                  5       0.1%
1556                  5       0.1%
1555                  5       0.1%
1554                  7       0.1%
1553                  4       < 0.1%
1552                  7       0.1%
1551                  6       0.1%
1550                  7       0.1%
1549                  3       < 0.1%

Item_Weight
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 416
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.85764518
Minimum: 4.555
Maximum: 21.35
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.7 KiB

Quantile statistics

Minimum: 4.555
5-th percentile: 6.13
Q1: 9.31
median: 12.85764518
Q3: 16
95-th percentile: 20.19
Maximum: 21.35
Range: 16.795
Interquartile range (IQR): 6.69

Descriptive statistics

Standard deviation: 4.226123725
Coefficient of variation (CV): 0.3286856702
Kurtosis: -0.8602944788
Mean: 12.85764518
Median Absolute Deviation (MAD): 3.342354816
Skewness: 0.09056145192
Sum: 109585.7099
Variance: 17.86012174
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
12.85764518           1463    17.2%
12.15                 86      1.0%
17.6                  82      1.0%
13.65                 77      0.9%
11.8                  76      0.9%
15.1                  68      0.8%
9.3                   68      0.8%
16.7                  66      0.8%
10.5                  66      0.8%
19.35                 63      0.7%
Other values (406)    6408    75.2%

Smallest values

Value                 Count   Frequency (%)
4.555                 4       < 0.1%
4.59                  5       0.1%
4.61                  7       0.1%
4.615                 4       < 0.1%
4.635                 5       0.1%
4.785                 5       0.1%
4.805                 4       < 0.1%
4.88                  5       0.1%
4.905                 2       < 0.1%
4.92                  5       0.1%

Largest values

Value                 Count   Frequency (%)
21.35                 7       0.1%
21.25                 24      0.3%
21.2                  5       0.1%
21.1                  17      0.2%
21                    6       0.1%
20.85                 35      0.4%
20.75                 39      0.5%
20.7                  62      0.7%
20.6                  38      0.4%
20.5                  44      0.5%

Item_Fat_Content
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 66.7 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8523
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 2
5th row: 1

Common Values

Value   Count   Frequency (%)
1       5089    59.7%
2       2889    33.9%
0       316     3.7%
4       117     1.4%
3       112     1.3%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

Every value is a single character, so the character counts are identical to the value counts under Common Values. The same counts apply to the most frequent character per category, per script, and per block.

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   8523    100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   8523    100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   8523    100.0%

Item_Visibility
Real number (ℝ≥0)

ZEROS

Distinct: 7880
Distinct (%): 92.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06613202878
Minimum: 0
Maximum: 0.328390948
Zeros: 526
Zeros (%): 6.2%
Negative: 0
Negative (%): 0.0%
Memory size: 66.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.0269894775
median: 0.053930934
Q3: 0.0945852925
95-th percentile: 0.1637797636
Maximum: 0.328390948
Range: 0.328390948
Interquartile range (IQR): 0.067595815

Descriptive statistics

Standard deviation: 0.05159782232
Coefficient of variation (CV): 0.7802243977
Kurtosis: 1.679445483
Mean: 0.06613202878
Median Absolute Deviation (MAD): 0.030972154
Skewness: 1.16709055
Sum: 563.6432813
Variance: 0.002662335268
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
0                     526     6.2%
0.076975118           3       < 0.1%
0.041283361           2       < 0.1%
0.085622362           2       < 0.1%
0.187841082           2       < 0.1%
0.134975628           2       < 0.1%
0.107223632           2       < 0.1%
0.085274988           2       < 0.1%
0.076855628           2       < 0.1%
0.059835659           2       < 0.1%
Other values (7870)   7978    93.6%

Smallest values

Value                 Count   Frequency (%)
0                     526     6.2%
0.003574698           1       < 0.1%
0.003589104           1       < 0.1%
0.003597678           1       < 0.1%
0.003599378           1       < 0.1%
0.003606726           1       < 0.1%
0.003612411           1       < 0.1%
0.005209791           1       < 0.1%
0.005230786           1       < 0.1%
0.005234153           1       < 0.1%

Largest values

Value                 Count   Frequency (%)
0.328390948           1       < 0.1%
0.325780807           1       < 0.1%
0.32111501            1       < 0.1%
0.311090379           1       < 0.1%
0.309390255           1       < 0.1%
0.308145448           1       < 0.1%
0.306542848           1       < 0.1%
0.305305397           1       < 0.1%
0.304859104           1       < 0.1%
0.304737387           1       < 0.1%

Item_Type
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.226680746
Minimum: 0
Maximum: 15
Zeros: 648
Zeros (%): 7.6%
Negative: 0
Negative (%): 0.0%
Memory size: 66.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 4
median: 6
Q3: 10
95-th percentile: 14
Maximum: 15
Range: 15
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.209989863
Coefficient of variation (CV): 0.5825620379
Kurtosis: -0.9662196602
Mean: 7.226680746
Median Absolute Deviation (MAD): 3
Skewness: 0.1016546244
Sum: 61593
Variance: 17.72401464
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=16)

Common values

Value                 Count   Frequency (%)
6                     1232    14.5%
13                    1200    14.1%
9                     910     10.7%
5                     856     10.0%
4                     682     8.0%
3                     649     7.6%
0                     648     7.6%
8                     520     6.1%
14                    445     5.2%
10                    425     5.0%
Other values (6)      956     11.2%

Smallest values

Value                 Count   Frequency (%)
0                     648     7.6%
1                     251     2.9%
2                     110     1.3%
3                     649     7.6%
4                     682     8.0%
5                     856     10.0%
6                     1232    14.5%
7                     214     2.5%
8                     520     6.1%
9                     910     10.7%

Largest values

Value                 Count   Frequency (%)
15                    148     1.7%
14                    445     5.2%
13                    1200    14.1%
12                    64      0.8%
11                    169     2.0%
10                    425     5.0%
9                     910     10.7%
8                     520     6.1%
7                     214     2.5%
6                     1232    14.5%

Item_MRP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 5938
Distinct (%): 69.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 140.992782
Minimum: 31.29
Maximum: 266.8884
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.7 KiB

Quantile statistics

Minimum: 31.29
5-th percentile: 42.5167
Q1: 93.8265
median: 143.0128
Q3: 185.6437
95-th percentile: 250.76924
Maximum: 266.8884
Range: 235.5984
Interquartile range (IQR): 91.8172

Descriptive statistics

Standard deviation: 62.27506651
Coefficient of variation (CV): 0.4416897492
Kurtosis: -0.8897690937
Mean: 140.992782
Median Absolute Deviation (MAD): 46.0376
Skewness: 0.1272022683
Sum: 1201681.481
Variance: 3878.183909
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
172.0422              7       0.1%
188.1872              6       0.1%
170.5422              6       0.1%
109.5228              6       0.1%
196.5084              6       0.1%
142.0154              6       0.1%
196.5768              6       0.1%
192.2478              5       0.1%
143.2154              5       0.1%
108.6912              5       0.1%
Other values (5928)   8465    99.3%

Smallest values

Value                 Count   Frequency (%)
31.29                 1       < 0.1%
31.49                 1       < 0.1%
31.89                 1       < 0.1%
31.9558               2       < 0.1%
32.0558               1       < 0.1%
32.09                 1       < 0.1%
32.3558               1       < 0.1%
32.4558               1       < 0.1%
32.49                 1       < 0.1%
32.6558               2       < 0.1%

Largest values

Value                 Count   Frequency (%)
266.8884              2       < 0.1%
266.6884              2       < 0.1%
266.5884              2       < 0.1%
266.2884              1       < 0.1%
266.1884              2       < 0.1%
266.0226              1       < 0.1%
265.8884              1       < 0.1%
265.7884              1       < 0.1%
265.6884              1       < 0.1%
265.5568              1       < 0.1%

Outlet_Identifier
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.722280887
Minimum: 0
Maximum: 9
Zeros: 555
Zeros (%): 6.5%
Negative: 0
Negative (%): 0.0%
Memory size: 66.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 5
Q3: 7
95-th percentile: 9
Maximum: 9
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.837201297
Coefficient of variation (CV): 0.6008116343
Kurtosis: -1.260779927
Mean: 4.722280887
Median Absolute Deviation (MAD): 3
Skewness: -0.05986138181
Sum: 40248
Variance: 8.049711202
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Common values

Value   Count   Frequency (%)
5       935     11.0%
1       932     10.9%
6       930     10.9%
9       930     10.9%
8       930     10.9%
7       929     10.9%
3       928     10.9%
2       926     10.9%
0       555     6.5%
4       528     6.2%

Smallest values

Value   Count   Frequency (%)
0       555     6.5%
1       932     10.9%
2       926     10.9%
3       928     10.9%
4       528     6.2%
5       935     11.0%
6       930     10.9%
7       929     10.9%
8       930     10.9%
9       930     10.9%

Largest values

Value   Count   Frequency (%)
9       930     10.9%
8       930     10.9%
7       929     10.9%
6       930     10.9%
5       935     11.0%
4       528     6.2%
3       928     10.9%
2       926     10.9%
1       932     10.9%
0       555     6.5%

Outlet_Establishment_Year
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1997.831867
Minimum: 1985
Maximum: 2009
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.7 KiB

Quantile statistics

Minimum: 1985
5-th percentile: 1985
Q1: 1987
median: 1999
Q3: 2004
95-th percentile: 2009
Maximum: 2009
Range: 24
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 8.371760408
Coefficient of variation (CV): 0.004190422902
Kurtosis: -1.205693917
Mean: 1997.831867
Median Absolute Deviation (MAD): 5
Skewness: -0.3966407859
Sum: 17027521
Variance: 70.08637233
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Common values

Value   Count   Frequency (%)
1985    1463    17.2%
1987    932     10.9%
1999    930     10.9%
1997    930     10.9%
2004    930     10.9%
2002    929     10.9%
2009    928     10.9%
2007    926     10.9%
1998    555     6.5%

Smallest values

Value   Count   Frequency (%)
1985    1463    17.2%
1987    932     10.9%
1997    930     10.9%
1998    555     6.5%
1999    930     10.9%
2002    929     10.9%
2004    930     10.9%
2007    926     10.9%
2009    928     10.9%

Largest values

Value   Count   Frequency (%)
2009    928     10.9%
2007    926     10.9%
2004    930     10.9%
2002    929     10.9%
1999    930     10.9%
1998    555     6.5%
1997    930     10.9%
1987    932     10.9%
1985    1463    17.2%

Outlet_Size
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 66.7 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8523
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value   Count   Frequency (%)
1       5203    61.0%
2       2388    28.0%
0       932     10.9%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

Every value is a single character, so the character counts are identical to the value counts under Common Values. The same counts apply to the most frequent character per category, per script, and per block.

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   8523    100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   8523    100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   8523    100.0%

Outlet_Location_Type
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 66.7 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8523
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 2
3rd row: 0
4th row: 2
5th row: 2

Common Values

Value   Count   Frequency (%)
2       3350    39.3%
1       2785    32.7%
0       2388    28.0%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

Every value is a single character, so the character counts are identical to the value counts under Common Values. The same counts apply to the most frequent character per category, per script, and per block.

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   8523    100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   8523    100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   8523    100.0%

Outlet_Type
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 66.7 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8523
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value   Count   Frequency (%)
1       5577    65.4%
0       1083    12.7%
3       935     11.0%
2       928     10.9%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

Every value is a single character, so the character counts are identical to the value counts under Common Values. The same counts apply to the most frequent character per category, per script, and per block.

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   8523    100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   8523    100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   8523    100.0%

Item_Outlet_Sales
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3493
Distinct (%): 41.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2181.288914
Minimum: 33.29
Maximum: 13086.9648
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.7 KiB

Quantile statistics

Minimum: 33.29
5-th percentile: 188.4214
Q1: 834.2474
median: 1794.331
Q3: 3101.2964
95-th percentile: 5522.811
Maximum: 13086.9648
Range: 13053.6748
Interquartile range (IQR): 2267.049

Descriptive statistics

Standard deviation: 1706.499616
Coefficient of variation (CV): 0.7823354371
Kurtosis: 1.615876681
Mean: 2181.288914
Median Absolute Deviation (MAD): 1081.925
Skewness: 1.177530603
Sum: 18591125.41
Variance: 2912140.938
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
958.752               17      0.2%
1342.2528             16      0.2%
1845.5976             15      0.2%
703.0848              15      0.2%
1278.336              14      0.2%
1230.3984             14      0.2%
1416.8224             13      0.2%
1438.128              12      0.1%
759.012               12      0.1%
575.2512              12      0.1%
Other values (3483)   8383    98.4%

Smallest values

Value                 Count   Frequency (%)
33.29                 2       < 0.1%
33.9558               1       < 0.1%
34.6216               1       < 0.1%
35.2874               1       < 0.1%
36.619                2       < 0.1%
37.2848               1       < 0.1%
37.9506               5       0.1%
38.6164               2       < 0.1%
39.948                2       < 0.1%
40.6138               2       < 0.1%

Largest values

Value                 Count   Frequency (%)
13086.9648            1       < 0.1%
12117.56              1       < 0.1%
11445.102             1       < 0.1%
10993.6896            1       < 0.1%
10306.584             1       < 0.1%
10256.649             1       < 0.1%
10236.675             1       < 0.1%
10072.8882            1       < 0.1%
9779.9362             1       < 0.1%
9678.0688             1       < 0.1%

Interactions

[Figures: 64 pairwise interaction plots (8 × 8 numeric variables); the images are not recoverable from the extracted text]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
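
As an illustration of this definition, here is a minimal pure-Python sketch. The function name `pearson_r` and the toy inputs are hypothetical; this is not the code the report itself runs. The 1/n factors in the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products; the common 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(xs, ys))                          # perfectly linear, increasing
print(pearson_r(xs, [-y for y in ys]))            # perfectly linear, decreasing
print(pearson_r(xs, [3.0 * y + 5.0 for y in ys])) # unchanged by shift and scale
```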

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
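
The rank-based definition can be sketched in pure Python as follows. The helper names are hypothetical, and assigning tied values the average of their positions is an assumption made here (a common convention), not something the report states.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]   # monotonic but not linear
print(spearman_rho(xs, ys))
print(pearson_r(xs, ys))
```

On the cubic toy data the rank correlation is perfect while Pearson's r falls below 1, which is the behaviour the paragraph above describes.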

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
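
The pair-counting formula as stated corresponds to the variant usually called tau-a. The sketch below implements exactly that formula; the function name is hypothetical, and library implementations may apply a tie-adjusted variant instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in either variable count as neither.
    return (concordant - discordant) / len(pairs)


# One swapped neighbour among four observations: 5 concordant, 1 discordant pair.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```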

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
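
A hedged sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013): the chi-squared statistic per observation is reduced by (k-1)(r-1)/(n-1), and the table dimensions r and k are shrunk before normalising. The function name and the toy data are hypothetical, and this is not necessarily the exact code path used to produce the report.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two sequences of nominal labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table: a variable with a single level
    return math.sqrt(phi2_corr / denom)


# Toy data: the second variable is fully determined by the first.
perfect = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
# Toy data: every combination occurs equally often.
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```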

Missing values

[Figure: bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
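
Both views are built from the same information: whether each cell holds a value. The sketch below shows the idea on a toy table, assuming rows are dictionaries and missing entries are represented by `None`; the function names and the toy rows are hypothetical. The profiled dataset itself reports 0 missing cells.

```python
def nullity_by_column(rows, columns):
    """Number of missing (None) entries in each column."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


def nullity_matrix(rows, columns):
    """One boolean per cell: True where a value is present."""
    return [[row.get(col) is not None for col in columns] for row in rows]


# Hypothetical toy rows with two gaps.
columns = ["Item_Weight", "Outlet_Size"]
rows = [
    {"Item_Weight": 9.3, "Outlet_Size": 1},
    {"Item_Weight": None, "Outlet_Size": 1},
    {"Item_Weight": 17.5, "Outlet_Size": None},
]

counts = nullity_by_column(rows, columns)
matrix = nullity_matrix(rows, columns)
print(counts)
for line in matrix:
    print("".join("#" if present else "." for present in line))
```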

Sample

First rows

Index  Item_Identifier  Item_Weight  Item_Fat_Content  Item_Visibility  Item_Type  Item_MRP  Outlet_Identifier  Outlet_Establishment_Year  Outlet_Size  Outlet_Location_Type  Outlet_Type  Item_Outlet_Sales
0      156              9.300000     1                 0.016047         4          249.8092  9                  1999                       1            0                     1            3735.1380
1      8                5.920000     2                 0.019278         14         48.2692   3                  2009                       1            2                     2            443.4228
2      662              17.500000    1                 0.016760         10         141.6180  9                  1999                       1            0                     1            2097.2700
3      1121             19.200000    2                 0.000000         6          182.0950  0                  1998                       1            2                     0            732.3800
4      1297             8.930000     1                 0.000000         9          53.8614   1                  1987                       0            2                     1            994.7052
5      758              10.395000    2                 0.000000         0          51.4008   3                  2009                       1            2                     2            556.6088
6      696              13.650000    2                 0.012741         13         57.6588   1                  1987                       0            2                     1            343.5528
7      738              12.857645    1                 0.127470         13         107.7622  5                  1985                       1            2                     3            4022.7636
8      440              16.200000    2                 0.016687         5          96.9726   7                  2002                       1            1                     1            1076.5986
9      990              19.200000    2                 0.094450         5          187.8214  2                  2007                       1            1                     1            4710.5350

Last rows

Index  Item_Identifier  Item_Weight  Item_Fat_Content  Item_Visibility  Item_Type  Item_MRP  Outlet_Identifier  Outlet_Establishment_Year  Outlet_Size  Outlet_Location_Type  Outlet_Type  Item_Outlet_Sales
8513   449              12.000       2                 0.020407         10         99.9042   6                  2004                       2            1                     1            595.2252
8514   145              15.000       2                 0.054489         3          57.5904   7                  2002                       1            1                     1            468.7232
8515   445              20.700       1                 0.021518         0          157.5288  3                  2009                       1            2                     2            1571.2880
8516   1356             18.600       1                 0.118661         11         58.7588   3                  2009                       1            2                     2            858.8820
8517   389              20.750       4                 0.083607         5          178.8318  8                  1997                       2            0                     1            3608.6360
8518   370              6.865        1                 0.056783         13         214.5218  1                  1987                       0            2                     1            2778.3834
8519   897              8.380        2                 0.046982         0          108.1570  7                  2002                       1            1                     1            549.2850
8520   1357             10.600       1                 0.035186         8          85.1224   6                  2004                       2            1                     1            1193.1136
8521   681              7.210        2                 0.145221         13         103.1332  3                  2009                       1            2                     2            1845.5976
8522   50               14.800       1                 0.044878         14         75.4670   8                  1997                       2            0                     1            765.6700